Localized Prediction of Continuous Target Variables Using Hierarchical Clustering

Authors

  • Aleksandar Lazarevic
  • Ramdev Kanapady
  • Chandrika Kamath
  • Vipin Kumar
  • Kumar K. Tamma
Abstract

In this paper, we propose a novel technique for the efficient prediction of multiple continuous target variables from high-dimensional and heterogeneous data sets using a hierarchical clustering approach. The proposed approach consists of three phases applied recursively: partitioning, localization and prediction. In the partitioning step, similar target variables are grouped together by a clustering algorithm. In the localization step, a classification model is used to predict which group of target variables is of particular interest. If the identified group of target variables still contains a large number of target variables, the partitioning and localization steps are repeated recursively and the identified group is further split into subgroups with more similar target variables. When the number of target variables per identified subgroup is sufficiently small, the third step predicts target variables using localized prediction models built from only those data records that correspond to the particular subgroup. Experiments performed on the problem of damage prediction in complex mechanical structures indicate that our proposed hierarchical approach is computationally more efficient and more accurate than straightforward methods of predicting each target variable individually or simultaneously using global prediction models.
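The recursive partitioning, localization and prediction phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the classifier, or the local prediction models, so the sketch substitutes simple stand-ins (a farthest-pair divisive split for partitioning, a nearest-centroid rule for localization, and a 1-nearest-neighbour lookup as the local model). The function names, the `max_leaf` parameter, the rule that the target "of interest" for a record is the one with the largest magnitude, and the toy data are all assumptions made for illustration.

```python
from statistics import mean

def active(y):
    """Assumption: the target 'of interest' in a record is the largest in magnitude."""
    return max(range(len(y)), key=lambda j: abs(y[j]))

def centroid(xs):
    """Component-wise mean of a list of input vectors."""
    return [mean(col) for col in zip(*xs)]

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def partition(targets, sig):
    """Partitioning stand-in: seed two groups with the farthest pair of target
    signatures, then assign every other target to the nearer seed."""
    a, b = max(((s, t) for s in targets for t in targets if s < t),
               key=lambda p: dist2(sig[p[0]], sig[p[1]]))
    left, right = [a], [b]
    for t in targets:
        if t not in (a, b):
            nearer_a = dist2(sig[t], sig[a]) <= dist2(sig[t], sig[b])
            (left if nearer_a else right).append(t)
    return left, right

def build(records, targets, max_leaf=2):
    """Recursively split the targets until each group has at most max_leaf members.
    Assumes every target is the active one in at least one record."""
    if len(targets) <= max_leaf:
        # Leaf: keep only the records that belong to this subgroup.
        return {"targets": sorted(targets), "records": records}
    # Signature of a target: mean input over the records where it is active.
    sig = {t: centroid([x for x, y in records if active(y) == t]) for t in targets}
    children = []
    for group in partition(targets, sig):
        sub = [(x, y) for x, y in records if active(y) in group]
        children.append((centroid([x for x, _ in sub]), build(sub, group, max_leaf)))
    return {"children": children}

def predict(node, x):
    """Localization (nearest group centroid) down to a leaf, then a local
    1-nearest-neighbour prediction using only that leaf's records."""
    while "children" in node:
        node = min(node["children"], key=lambda c: dist2(c[0], x))[1]
    _, y = min(node["records"], key=lambda r: dist2(r[0], x))
    return {t: y[t] for t in node["targets"]}

# Toy data (hypothetical): 4 targets, each active in 3 records near its own input region.
records = []
for t, cx in enumerate([0.0, 1.0, 10.0, 11.0]):
    for k in range(3):
        y = [0.0] * 4
        y[t] = 1.0 + k
        records.append(([cx + 0.1 * k, 0.5 * k], y))

tree = build(records, targets=[0, 1, 2, 3], max_leaf=2)
print(predict(tree, [10.05, 0.0]))
```

In this sketch each leaf stores only the records of its own subgroup, which mirrors the abstract's point that the localized models are built from a subset of the data rather than from all records at once.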

Similar resources

Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation

The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential for sediment production. Different estimates of sediment amounts, along with the lack of long-term measurements, limit access to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...

Financial Time Series Forecasting with Grouped Predictors using Hierarchical Clustering and Support Vector Regression

Financial time series prediction is regarded as one of the most challenging tasks due to the inherent noise and non-stationarity of the data. This paper proposes a two-stage financial time series prediction approach hybridizing support vector regression (SVR) with hierarchical clustering (HC). By averaging the variables within the clusters obtained from hierarchical clustering, we define super pre...

Power Prediction of Mobile Processors based on Statistical Analysis of Performance Monitoring Events

In mobile systems, energy efficiency is critical for extending battery life. Power consumption should therefore be taken into account, in addition to performance, when developing software. Software that is efficient in both power and performance can be designed if power is predicted accurately while the software executes. In this paper, a power estimation model is developed using statistical analysi...

Evaluating Different Approaches to Permeability Prediction in a Carbonate Reservoir

Permeability can be measured directly in the laboratory using cores taken from the reservoir. Due to the high cost associated with coring, cores are available for only a limited number of wells in a field. Many empirical models, statistical methods, and intelligent techniques have been suggested to predict permeability in un-cored wells from easy-to-obtain and frequent data such as wireline logs. The main obj...

ProtoNet: Navigating the Hierarchical Clustering of the Protein Space

The ProtoNet site provides an automatic hierarchical clustering of the protein space. The clustering is based on an all-against-all BLAST similarity test. With this similarity measure we proceed to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a hierarchy...

Publication date: 2003